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globally. Some individuals experience systemic symptoms post-vaccination, which overlap with COVID-19 
symptoms. This study compared early post-vaccination symptoms in individuals who subsequently tested 
positive or negative for SARS-CoV-2, using data from the COVID Symptom Study (CSS) app. 

Methods: We conducted a prospective observational study in 1,072,313 UK CSS participants who were 
asymptomatic when vaccinated with Pfizer-BioNTech mRNA vaccine (BNT162b2) or Oxford-AstraZeneca 
adenovirus-vectored vaccine (ChAdOx1 nCoV-19) between 8 December 2020 and 17 May 2021, who subse- 
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Side-effects quently reported symptoms within seven days (N=362,770) (other than local symptoms at injection site) and 
Self-reported symptoms were tested for SARS-CoV-2 (N=14,842), aiming to differentiate vaccination side-effects per se from superim- 
Mobile technology posed SARS-CoV-2 infection. The post-vaccination symptoms and SARS-CoV-2 test results were contempora- 
Early detection neously logged by participants. Demographic and clinical information (including comorbidities) were 
severe acute respiratory syndrome—related recorded. Symptom profiles in individuals testing positive were compared with a 1:1 matched population 


coronavirus 2 (SARS-CoV-2) testing negative, including using machine learning and multiple models considering UK testing criteria. 


Findings: Differentiating post-vaccination side-effects alone from early COVID-19 was challenging, with a 
sensitivity in identification of individuals testing positive of 0.6 at best. Most of these individuals did not 
have fever, persistent cough, or anosmia/dysosmia, requisite symptoms for accessing UK testing; and many 
only had systemic symptoms commonly seen post-vaccination in individuals negative for SARS-CoV-2 (head- 
ache, myalgia, and fatigue). 
Interpretation: Post-vaccination symptoms per se cannot be differentiated from COVID-19 with clinical 
robustness, either using symptom profiles or machine-derived models. Individuals presenting with systemic 
symptoms post-vaccination should be tested for SARS-CoV-2 or quarantining, to prevent community spread. 
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Research in context 


Evidence before this study 


There are multiple surveillance platforms internationally 
interrogating COVID-19 and/or post-vaccination side-effects. 
We designed a study to examine for differences between vacci- 
nation side-effects and early symptoms of COVID-19. We 
searched PubMed for peer-reviewed articles published 
between 1 January 2020 and 21 June 2021, using 
keywords: "COVID-19" AND "Vaccination" AND ("mobile appli- 
cation" OR "web tool" OR "digital survey" OR "early detection" 
OR "Self-reported symptoms" OR “side-effects”). Of 185 results, 
25 studies attempted to differentiate symptoms of COVID- 
19 vs. post-vaccination side-effects; however, none used artifi- 
cial intelligence (AI) technologies (“machine learning”) coupled 
with real-time data collection that also included comprehen- 
sive and systematic symptom assessment. Additionally, none 
of these studies attempted to discriminate the early signs of 
infection from side-effects of vaccination (specifically here: 
Pfizer-BioNTech mRNA vaccine (BNT162b2) and Oxford-Astra- 
Zeneca adenovirus-vectored vaccine (ChAdOx1 nCoV-19)). Fur- 
ther, none of these studies sought to provide comparisons with 
current testing criteria used by healthcare services. 


Added value of this study 


This study, in a large community-based cohort, uses prospec- 
tive data capture in a novel effort to identify individuals with 
COVID-19 in the immediate post-vaccination period. Our 
results suggest that early symptoms of SARS-CoV-2 cannot be 
differentiated from vaccination side-effects robustly. 


Implications of all the available evidence 


Our study suggests that post-vaccination symptoms per se can- 
not be differentiated from COVID-19 with clinical robustness 
and therefore individuals presenting with systemic symptoms 
post-vaccination should be tested for SARS-CoV-2 to prevent 
community spread. 











1. Introduction 


The havoc wrought by SARS-CoV-2 is unprecedented in living 
memory, with >184 million cases of COVID-19 world-wide and 
>4.0 million deaths by 8 July 2021 [1,2]. Extraordinary efforts 
directed towards rapid vaccine development meant that by late 2020 
the UK Medicines and Healthcare products Regulatory Agency had 
authorized three vaccines: Pfizer-BioNTech mRNA (BNT162b2) [3,4] 
Oxford-AstraZeneca adenovirus-vectored [5-7] and Moderna mRNA 
(mRNA-1273) [8,9]. A fourth vaccine (Janssen adenovirus-vectored 
Ad26.COV2.S) was authorised on 28 May 2021 [10]. Vaccination with 
BNT162b2 (herein, PB) and ChAdOx1 nCoV-19 (herein, O-AZ) started 
in the UK on 8 December 2020 [11] and 4 January 2021 [12] respec- 
tively, during which time the UK was experiencing its third pandemic 
wave with widespread community transmission (peak UK positive 
specimens reported on 29 December 2020) [13]. Since then, and in 
the context of social distancing and stay-at-home directives, new 
infections, hospitalisations, and deaths from SARS-CoV-2 have fallen 
rapidly and remained low until June 2021 [1,2,13]. 

Local and systemic reactions have been observed after all vaccines 
for SARS-CoV-2. Considering the two vaccines used predominantly in 
the UK to date (O-AZ and PB), local reactions were common during their 
pivotal trials (76% of younger (<55 years) O-AZ recipients reported 


tenderness; [5,6] 83% of younger PB recipients reported pain) [4]. Sys- 
temic reactions were also common and included fatigue (O-AZ 76%; PB 
59%), headache (O-AZ 65%; PB 52%), and fever (O-AZ 24%; PB 16%) 
[4—6]. Observational data from the COVID Symptom Study (CSS) 
[14,15] also showed high incidence of local (62%) and systemic (26%) 
effects [16]. Other real-world experience has resulted in identification 
of some very rare but serious side effects, such as vaccine-induced 
thrombotic thrombocytopenia (VITT), associated with anti-PF4 anti- 
body production, and myocarditis [17—19]. Saliently, most vaccine- 
related side-effects (including VITT) are more common in younger indi- 
viduals, whereas COVID-19 clinical severity increases with age 
[4—6,16]. 

Prevention of SARS-CoV-2 dissemination requires rapid recognition 
followed by quarantining of infected individuals (along with appropri- 
ate health care). However, there is overlap between symptoms from 
COVID-19 [1,5,20,21] and early post-vaccination systemic symptoms 
[4—6,16]. Moreover, immunity to SARS-CoV-2 does not occur immedi- 
ately post-vaccination, [22] with functional protection from approxi- 
mately day 12 [23]. Quarantining and testing every individual with 
systemic symptoms early post-vaccination would be onerous, expen- 
sive, and labour-intensive — but given the impact of viral outbreaks 
might be unavoidable if SARS-CoV-2 infection cannot be excluded 
robustly [15,20]. 

Here we aim to determine whether symptom profiles can be used 
to differentiate individuals with systemic side-effects of vaccination 
alone from individuals with superimposed SARS-COV-2 infection. 


2. Methods 
2.1. Study design and Participants 


Data were acquired prospectively from the CSS, using a mobile 
health application launched by ZOE Limited and King’s College Lon- 
don in March 2020 (app details and development given in Supple- 
mentary Methods) [14,15]. Briefly, individuals are asked daily to log 
their health status, health care access, SARS-CoV-2 testing and 
results, and vaccination details, with direct questions about 
symptoms associated with COVID-19 (Supplementary Table S1) 
[14,15]. Symptomatic individuals are prompted to undergo testing, 
either through standard care or through test request from ZOE/CSS 
[24]. 

Data were acquired from UK participants aged 16-90 years, 
between 8 December 2020 (UK vaccination start date) and 17 May 
2021, who were asymptomatic when vaccinated with PB or O-AZ 
(first or second dose), and subsequently reported: a) at least one pre- 
defined symptom (Supplementary Table S1) within seven days post- 
vaccination, and b) a SARS-CoV-2 test result (reverse transcription 
polymerase chain reaction [rtPCR] or lateral flow antigen test [LFAT]) 
within ten days post-vaccination. The seven-day cut-off for symptom 
presentation was informed by: a) serial interval for COVID-19 (the 
time interval between the primary and secondary case; mean, 5.2 
days [25]); b) incubation period for SARS-CoV-2 (mean, 5.8 days 
[25,26]); c) the timeline for acute post-vaccination side-effects in 
both pivotal trials (one-week [4—6,8,10]) and d) reported real-world 
experience of post-vaccination symptoms (peak prevalence day 1 
post-vaccination; mean duration one day [16]). The ten-day cut-off 
for testing allowed three days’ delay in accessing testing [27]. 

Early results indicated a large imbalance in numbers of individuals 
testing positive vs. negative post-vaccination (3525:11317, positive: 
negative SARS-CoV-2 tested individuals), sufficient to bias analysis. 
[28] A 1:1 population from the negative cohort (matching age, BMI, 
gender, occupation, week of testing, and comorbidities) was selected 
based on minimisation of Euclidean distance between positive and 
negative subjects considering these features, enabling a fair compari- 
son between groups of equal size [29]. However, to ensure robust- 
ness, analyses were repeated using: a) a one-hundred bootstrapping 
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scheme selecting from the negative population; [30] and b) the entire 
negative population. 

Individual symptoms (here, a symptom reported at any time 
within seven days post-vaccination, irrespective of duration) were 
compared between recently vaccinated individuals testing positive 
or negative for SARS-CoV-2, using Chi-squared tests per symptom, 
given that normality was not present for most of the symptom’s dis- 
tribution. Duration of individual symptoms was calculated as days 
from first report of that symptom, until asymptomatic and/or seven 
days post-vaccination, noting that duration beyond seven days was 
not considered; however, as the number of individuals experiencing 
each symptom was low in both groups, no statistical comparison was 
made. Symptom burden, defined as total symptom count per person 
[irrespective of symptom duration] was compared between groups 
using Mann-Whitney-U tests. We also considered symptom manifes- 
tation across the week post-vaccination, by dynamic profiling for 
each symptom (symptom frequency). Correlation of individual symp- 
toms within both positive and negative individuals was assessed by 
computing a Spearman-rank correlation test. Local symptoms due to 
vaccination per se (Supplementary Table S2) were excluded from 
analysis as unlikely to be indicative of, or influenced by, SARS-CoV-2 
infection. 

Machine learning was used to determine if post-vaccination 
symptoms per se could be separated from superimposed SARS-CoV-2 


infection (including symptom combination, and cumulative symptom 
burden) [15,21,31]. We trained a set of binary classifiers to identify 
SARS-CoV-2-positive individuals. Models included random forest, 
logistic regression, and Bayesian mixed-effect models, exploiting 
their varying properties (Supplementary Table S3) to improve reli- 
ability of results. The models were trained using the outcome of the 
SARS-CoV-2 testing as response variable and ground truth for both 
training and model assessment (validation). We also considered 
whether symptoms occurred after the first or second vaccination, by 
including vaccination dose as a covariate. Models were trained on 
data without stratifying by vaccine type, due to small sample sizes. 
Further covariates included the age, gender, and BMI of the partici- 
pants (details available in Supplementary Table S3). We also did not 
discriminate between type of SARS-CoV-2 testing (PCR vs. LFAT), or 
mode of testing access (NHS vs. ZOE-request), either in model devel- 
opment or other analyses. 

We sought to reduce bias from assessing high numbers of individ- 
ual symptoms by performing symptom-clustering using K-means 
[32]. However, a relevant/accurate number of clusters was not evi- 
dent from the silhouette plot and entropy (data not shown); thus, 
further analyses using machine clustering were not pursued. Symp- 
toms were clustered manually into clinical groupings (reviewed by 
ELD, MO, TS, AH, CJS) (Supplementary Table S4), and analysed using 
the above models similarly. Lastly, illness classification according to 
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Figure 1. Flowchart of individuals included in this study. Symptoms* within 7 days excluded local symptoms related to injection site. SARS-CoV-2 test included both rtPCR and 
LFAT. Positive and negative refers to self-logged test results. DI: Data invalid. 1% and 2" dose refer to the first and second doses of the two vaccines, respectively. 


4 LS. Canas et al. / EClinicalMedicine 42 (2021) 101212 


Table 1 
Confusion matrix of the probable positive SARS-CoV-2 infections according to 
UK testing criteria. 





UK testing criteria 








Positive Negative 
SARS-CoV-2 SARS-CoV-2 
SARS-CoV-2 Positive SARS-CoV-2 62 88 
Testing result Negative SARS-CoV-2 3,463 11,229 





based on having at least one of the four symptoms required for 
accessing NHS testing during the timing of this study (viz., presence 
or absence of fever, persistent cough, anosmia and/or dysosmia) [24] 
were assessed. 

Data were split into training and validation sets for random forest, 
logistic regression, and Bayesian mixed-effect models. Five folds were 
used on the training set, composed of 80% of the initial dataset ran- 
domly selected, to train the models in different subsamples of the 
population. The remaining 20% were then used to assess the perfor- 
mance of models, evaluating sensitivity, specificity, and balanced 
accuracy. The class ratio was maintained in both training and testing 
sets. Although models based on either clinical clustering or categori- 
sation according to NHS criteria do not require training - and thus 
potentially both could be assessed using the full dataset - to ensure 
fair evaluation, both models (i.e., clinical clustering and categorisation 
according to NHS criteria) were assessed on 20% of the data of each 
cohort (i.e., the 20% of each fold corresponding to the testing set). 


3. Ethical approval 


The app and CSS were approved in the UK by KCL’s ethics commit- 
tee (REMAS no. 18210, review reference LRS-19/20—18210). All app 
users provided informed consent for use of their data for COVID-19 
research. 


4. Role of the funding source 
The funders, namely UK Government Department of Health and 
Social Care, Wellcome Trust, UK Engineering and Physical Sciences 


Research Council, UK National Institute for Health Research, UK Med- 
ical Research Council and British Heart Foundation and Chronic 


Table 2 


Disease Research Foundation and Zoe Limited had no role in study 
design, data analysis, data interpretation, or influence on report con- 
tent. ZOE Limited developed the app for data collection as a not-for- 
profit endeavour. 


5. Results 


Figure 1 shows the flow chart for this study. Overall, 1,072,313 UK 
CSS app users were vaccinated with either O-AZ or PB (O-AZ: 
713,651; PB: 358,662). Of these, 362,770 (33.8% (O-AZ: 264,587; PB: 
98,183)) reported at least one symptom early post-vaccination, with 
SARS-CoV-2 testing in 14,842 (4.09%) of these individuals (O-AZ: 
10,765; PB: 4,077). A positive test was reported by 150 (1.01%) indi- 
viduals (O-AZ: 73/10,765 (0.68%); PB: 77/4077 (1.89%)). 

Within the tested group, 3,525/14,842 (23.75%) reported at least 
one requisite symptom fulfilling UK testing criteria; [24] 62 (1.76%) 
tested positive. Conversely, 11,317 tested individuals did not report 
any requisite symptom, of whom 88 (0.78%) tested positive. Individu- 
als with requisite symptoms were more likely to test positive than 
those without (p-value<0.0001); none-the-less, the majority (88 of 
150, 59%) who tested positive did not meet current UK testing criteria 
(Table 1). 

For further analyses, one positive individual (vaccinated with PB) 
was excluded due to invalid data entry (invalid BMI), leaving 149 
symptomatic positively-tested individuals. Table 2 describes the pos- 
itive and matched negative cohorts. 

Four symptomatic individuals who tested positive did so within 
10 days of their second vaccination; their symptoms after first vacci- 
nation were thus disregarded. As the matched negative control 
cohort (N=149) included matching for vaccination order, the controls 
also included four individuals reporting symptoms after second vac- 
cination. Given that small sample sizes compromise machine learning 
and model training, all 149 subjects in each of the positive and nega- 
tive groups were included for either training or testing of the models 
(according to the percentages mentioned above). We regressed the 
impact of vaccination order, by including it as a co-variate of the 
models, or as a pre-processing step for the clinical clustering and 
NHS algorithm. However, we could not conduct a fair statistical anal- 
ysis of symptoms after first vs. second vaccination for symptom pro- 
filing, given the very small numbers of infected individuals 
presenting after second vaccination (i.e., 4 subjects). Thus, as post- 
vaccination symptoms may vary after first vs. second dose, [16,31] 


Demographic information of vaccinated individuals testing positive or negative for SARS-CoV-2 infection. Data are presented as median value [IQR] for age and 
BMI; and numbers (percentages) for other values. BMI: Body Mass Index. IQR: Inter-Quartile Range. 





Positive testing for SARS-CoV-2 infection 


Vaccinated Cohort 


Negative testing for SARS-CoV-2 infection 








O-AZ PB Full cohort O-AZ PB Full cohort 
Number 73 76 149 73 76 149 
Males (%) 27 (37.0) 21 (27.6) 48 (32.2) 22 (30.1) 19 (25.0) 45 (30.2) 
Age, years (median [IQR]) 62.0 [50.0; 71.0] 59.0 [50.0; 67.5] 61.0 [50.0; 70.0] 65.0 [54.0; 69.0] 64.0 [52.0; 71.3] 63.0 [52.0; 70.0] 
BMI (median [IQR]) 25.0 [22.7; 28.0] 26.1 [23.5; 29.3] 25.4 [23.4; 29.2] 24.9 [23.0; 28.1] 25.5 [23.3; 27.0] 25.2 [23.3; 28.6] 
Lung disease (%) 8 (11.0) 9 (11.8) 17 (11.4) 8 (11.0) 14 (18.4) 20 (13.4) 
Kidney Disease (%) 0(0.0 1(1.3 1(0.7 1(1.4) 0 (0.0) 1(0.7 
Diabetes (%) 4(5.5 5 (6.6 9 (6.0 4 (5.5) 3 (3.9) 9(6.0 
Heart Disease (%) 7 (9.6 4(5.3 11 (7.4) 2 (2.7) 4 (5.3) 5(3.4 
Cancer (%) 0(0.0 4(5.3 4(2.7 4 (5.5) 3 (3.9) 6 (4.0 
Healthcare workers (%) 0(0.0 10 (13.2) 10 (6.7) 0(0.0) 7 (9.2) 2(1.3 
Visit to hospital (%) 1(1.4 1(1.3 2(13 1(1.4) 0 (0.0) 2(13 
Ethnicity: White 73 (100.0) 75 (98.7) 148 (99.3) 71 (97.2) 75 (98.7) 143 (96.0) 
Ethnicity: Black 0(0.0 0(0.0 0(0.0 0 (0.0) 1 (1.3) 1 (0.7 
Ethnicity: Asian 0 (0.0 0 (0.0 0 (0.0 1 (1.4) 0 (0.0) 2(13 
Ethnicity: Middle Eastern 0(0.0 1(1.3 1(0.7 0(0.0) 0 (0.0) 0(0.0 
Ethnicity: Mixed 0(0.0 0(0.0 0(0.0 0 (0.0) 0 (0.0) 0 (0.0 
Ethnicity: Other/Prefer not to say 0 (0.0 0 (0.0 o (0.0 1(1.4) o0 (0.0) 3 (2.0 
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Figure 2. Profiles of illness in symptomatic individuals early post-vaccination, comparing symptom prevalence (symptom reported at any time forforfor first week) in positive vs. 


negative cases (1:1 matched population; N=145 for each). *p < 0.05 **p < 0.01. 


we present the symptom profiles after first vaccination (N=145) in 
the main text, with data from the entire cohort (N=149) in the Sup- 
plementary Materials (Supplementary Table S5). 

Individual symptom prevalence after first vaccination is shown in 
Figure 2 and Supplementary Table S5. Although some symptoms 
were more common in individuals testing positive vs. negative (sore 
throat (p-value = 0.0187), sneezing (p value = 0.0474) and persistent 
cough (p-value = 0.0396)), others were more common in the negative 
group (palpitations (p-value = 0.0284). The numbers of individuals 
reporting each symptom were small (e.g., sore throat, n=17; persis- 
tent cough, n =12) (Supplementary Table S5). 

Median day of onset for any symptom post-vaccination was Day 1 
in both groups (noting all individuals were asymptomatic when vac- 
cinated), with highest symptom burden on Day 3, again in both 
groups (Figure 3; and Supplementary Tables S6 and S7). There was 


no difference in symptom burden between individuals testing posi- 
tive or negative (median: 12 in positive group, 10.5 in negative group, 
Mann-Whitney test p-value = 0.22). 

Table 3 presents prevalence of each symptom over the week post- 
vaccination, divided into three windows. Some symptoms increased 
over time in both positive and negative individuals (e.g., headache, 
myalgia) whereas others increased in positive individuals only (e.g., 
sneezing, hoarse voice). Although fever and sore throat increased 
across the week in the negative individuals, there was a suggestion of 
a biphasic response in the positive individuals, also observed with 
persistent cough. The numbers of individuals were too small for for- 
mal testing; moreover, the exact date of infection in positive individ- 
uals was unknown. 

Individual symptom duration is shown in Supplementary Table 
S8. There were no significant differences in symptom duration in 
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Figure 3. Symptom prevalence and distribution during the first week after the first dose of vaccination, in symptomatic individuals testing positive or negative for SARS-CoV-2. The 
colour bar represents the percentage of symptomatic individuals reporting each symptom. 


Table 3 

Symptom prevalence and timing during the first week post-vaccination (N=145), in symptomatic 
individuals testing positive or negative for SARS-CoV-2, grouped by days post-vaccination. Number 
of subjects and percentage (value in parenthesis). 





Positive SARS-CoV-2 Negative SARS-CoV-2 








Window 0-2 days 2-4 days 4-7days 0-2 days 2-4 days 4-7days 
Headache 22(15.2)  44(30.3)  66(45.5 31(21.4)  46(31.7) 66 (45.5) 
Chills or Shivers 13 (9.0) 25(17.2)  34(23.4 21(14.5) 25(17.2)  26(17.9) 
Myalgias A(2.8 12 (8.3) 26 (17.9 7 (4.8) 7(48 20 (13.8) 
Fever 14 (9.7) 10 (6.9) 21(14.5 11 (7.6) 14 (9.7) 20 (13.8) 
Dizziness 4(2.8 13 (9.0) 16 (11.0 6 (4.1) 21(14.5) 20(13.8) 
Sneezing 2(1.4 8 (5.5 18 (12.4 1(0.7) 8(5.5 6 (4.1 
Nausea 5 (3.4 8(5.5 11 (7.6) 6 (4.1) 15(10.3) 19(13.1) 
Sensitive Skin 0(0.0 6(4.1 3 (2.1) 0 (0.0) 2(14 2(14 
Rhinorrhoea A(2.8 10 (6.9) 21(14.5 1(0.7) 7(48 19 (13.1) 
Eye Soreness 1(0.7 6(4.1 11 (7.6) 4 (2.8) 5 (3.4 8(5.5 
Loss of Appetite 6(4.1 3 (2.1 19 (13.1 5 (3.4) 6(4.1 7 (4.8 
Chest Pain 0 (0.0 3 (2.1 14 (9.7) 1 (0.7) 2(14 5 (3.4 
Sore Throat 9(6.2 5 (3.4 30 (20.7 2 (1.4) 6(4.1 14 (9.7) 
Brain Fog 0(0.0 4(2.8 16 (11.0 5 (3.4) 5 (3.4 10 (6.9) 
Hoarse Voice 1(0.7 2(14 15 (10.3 1(0.7) 2(14 3 (2.1 
Anosmia 0 (0.0 4 (2.8 2 (1.4) 0 (0.0) 2(14 0(0.0 
Abdominal Pain 3 (2.1 3(21 16 (11.0 2 (1.4) 7(48 15 (10.3) 
Diarrhoea 3(2.1 A(2.8 11 (7.6) 2 (1.4) 3 (2.1 9 (6.2 
Dysosmia 1 (0.7 3 (2.1 12 (8.3) 3 (2.1) 1 (0.7 8 (5.5 
Ear Pain 0 (0.0 3 (2.1 3 (2.1) 1 (0.7) 3 (2.1 4 (2.8 
Red Welts FL 0 (0.0 1 (0.7 0 (0.0) 0 (0.0) 0 (0.0 1 (0.7 
Fatigue 1 (0.7 3 (2.1 10 (6.9) 6 (4.1) 2(14 2(14 
Persistent Cough 4(2.8 1(0.7 17(11.7) —0(0.0) 2(14 3 (21 
Swollen Glands 0(0.0 1(0.7 6 (4.1) 3 (2.1) 3 (2.1 1 (0.7 
Blisters on Feet o (0.0 0 (0.0 4 (2.8) o (0.0) o (0.0 o0 (0.0 
Dyspnoea 0 (0.0 0 (0.0 0 (0.0) 0 (0.0) 0 (0.0 o (0.0 
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Table 4 

Model performance in the classification of COVID-19 status according to post-vaccina- 
tion symptoms. Median values and percentiles [0.25 and 0.75] are obtained across five 
folds. AUC - area under curve, in a receiver operating characteristic analysis. 





Sensitivity Specificity ROC - AUC 





Bayesian Mixed-Effect 
Model 

Logistic Regression 

Random Forest 

Symptom clustering 

NHS screening criteria 


0.52 [0.47; 0.56] 0.55 [0.47; 0.60] 0.52 [0.47; 0.56] 


0.63 [0.58; 0.67] 0.67 (0.60; 0.72] 0.62 [0.58; 0.67] 
0.61 [0.58; 0.64] 0.63 [0.56; 0.69] 0.66 [0.61; 0.70] 
0.51 [0.49; 0.56] 0.67 (0.60; 0.73] 0.50 [0.47; 0.55] 
0.48 [0.48; 0.48] 0.62 [0.61; 0.63] 0.47 [0.47; 0.48] 





positive vs. negative individuals after first vaccination. Importantly, 
symptom assessment was truncated at seven days, noting as above 
that some symptoms were increasing in prevalence with time. 
Amongst individuals testing negative, dysosmia and delirium had the 
longest duration (median, 2 days for each). 

Similar results for symptom prevalence after first vaccination 
were obtained comparing the positive population with the con- 
structed cohort (1:1 matched) of negative individuals selected by 
bootstrapping (Supplementary Table S9) (Supplementary Figure S1); 
and with the negative population as a whole (Supplementary Figure 
S2). Some symptoms were significantly more common in negative 
individuals when using the entire negative population (e.g., brain 
fog), driven by the extremely large negative sample size, which sup- 
ports our use of a selected matched population to avoid bias from 
unbalanced sample size. 

There was no significant correlation between symptoms in either 
the positive or negative populations, assessed using Spearman-rank 
test (Supplementary Figure S3). As a sensitivity analysis we assessed 
the impact of a (self-logged) previous COVID-19 diagnosis; this did 
not alter our conclusions 


6. SARS-CoV-2 test outcome prediction modelling 


Model performance including receiver operator curves, using all 
reported symptoms, are shown in Table 4 and Figure 4. The best 
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performance was obtained with random forest, followed by logistic 
regression; however, neither reached clinical utility (conventionally, 
80%). Other models, including clinical symptom clustering (Supple- 
mentary Table S4) and categorisation of individuals using NHS 
screening criteria, were no better than chance. 


7. Discussion 


Here we aimed to develop a clinically useful algorithm predictive 
of SARS-CoV-2 infection early post-vaccination, by parsing symptoms 
according to proven infection status in symptomatic individuals. 
Such an algorithm would be extremely useful, particularly in coun- 
tries with limited health resources, as testing could be targeted 
towards those predicted positive, with quarantining of these individ- 
uals until an available result. To our knowledge, this is the first study 
with this aim. However, we were unable to differentiate post-vacci- 
nation symptoms per se from superimposed SARS-CoV-2 infection 
robustly. Although two models, LR and RF, showed ROC AUC signifi- 
cantly greater than 0.5, neither came close to approaching clinical 
utility - for most clinical tests, conventionally given as 0.8; but for a 
highly infectious agent with devastating consequences from commu- 
nity spread the necessary AUC is much higher. We consider employ- 
ing more complex model to improve the current results; however, 
we concluded that our sample size limited the use of such models. 
Were a larger dataset to be available, we agree this would be a poten- 
tial approach. 

Although one third of the one million vaccinated app users 
reported symptoms previously associated with COVID-19 early post- 
vaccination, only 4% of symptomatic individuals reported testing for 
SARS-CoV-2 even with allowance for delayed testing access. Consid- 
ering those individuals who reported at least one of the symptoms 
fulfilling NHS criteria for testing (266,502 overall), 40% (107,929) 
were tested. During the study period, testing was widely available in 
the UK and it is unclear why more symptomatic people (including 
those with the widely advertised core symptoms of fever, persistent 
cough, and anosmia/dysosmia) were not tested [33]. Possible reasons 
for not testing, even among individuals presenting any of the core 
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Figure 4. ROC-AUC performance for the different models. Mean value (line) and 95% CI (shadow area) of the models’ performance, given the predictions obtained over the cross- 
validation scheme (5-folds) adopted for the validation. DMEM: Bayesian Mixed-effect Model (blue); LR: Logistic Regression (yellow); RF: Random Forest (red); Symptom clustering 


(green) and NHS screening criteria (grey). 
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symptoms, include the lack of knowledge about when and where to 
test, as well as the absence of severe and/or multiple symptoms [33]. 
Additionally, there is currently no specific guidance given to vaccine 
recipients either highlighting the possibility of post-vaccination 
infection or on when to access testing in the post-vaccine period, 
both of which might also affect the decision of going for testing 
among vaccinated individuals [33,34]. Conversely, of 149 individuals 
who tested positive, only 62 (41%) had symptoms that met current 
UK testing criteria. We do not know why the other 88 positive indi- 
viduals were tested (e.g., contact tracing, routine workplace testing, 
direct personal request through the app). 

Our data also suggest sensitivity of using core symptoms to justify 
testing for COVID-19 may be lower post-vaccination than in pre-vac- 
cination times (here 48%, previously 73%.) [35]. Although individuals 
with core symptoms were more likely to test positive than those 
without, the overall sensitivity and AUC suggests current UK testing 
policy is suboptimal for pandemic management particularly now that 
rapid testing capacity is much greater than when these criteria were 
established [35]. Notably, current UK testing criteria are more limited 
than WHO guidelines [2] and those of many other jurisdictions of 
similar GDP (including France, Germany, USA, and Australia). 

Although there were some differences in symptom prevalence 
and distribution between positive and negative individuals, these 
could not be used robustly to discriminate between groups, including 
using machine-learning. We also considered time of symptom onset 
and symptom duration post-vaccination, (previous trials and post- 
marketing observational data have examined these parameters but 
not with respect to SARS-CoV-2 status) [4—6,8,16]. Whether positive 
or negative, median symptom peak burden was day 3 in both groups, 
concordant with vaccination side-effect profiles reported previously. 
4-68.16 As time progressed, some symptoms appeared to become 
more common in the positive group only (e.g., persistent cough, 
hoarse voice), the timing of which coincides with the serial interval 
and incubation period of SARS-CoV-2 [26]. Note that no formal statis- 
tical analyses were undertaken on this point, and the results regard- 
ing symptom duration are descriptive only. Future work should 
assess Statistical differences in symptom duration for the two groups, 
in larger cohorts. However, the critical public health importance of 
identifying and isolating cases early, and the lack of clear-cut differ- 
ences between infected and non-infected symptomatic individuals, 
does not allow the luxury of a watch-and-wait approach. 

We do not know the circumstances contributing to infection of 
the positive group (whether prior to, peri-, or immediately post-vac- 
cination), noting here that our requirement for a positive rtPCR or 
LFAT result confirmed only recent infection. The serial interval and 
incubation period for SARS-CoV-2, and the high prevalence of asymp- 
tomatic infection, mean individuals could have been infected before 
vaccination. Although current UK vaccination guidelines do not 
require individuals to be completely asymptomatic at time of vacci- 
nation (only not “acutely unwell”), [36] our inclusion criteria required 
individuals to be asymptomatic at time of vaccination. It is also possi- 
ble that infection was contracted whilst getting vaccinated. Nosoco- 
mial infection with SARS-CoV-2 has been reported in the UK, [37,38] 
including many health care workers infected at their workplace. 
[39,40]. Here we would emphasise strongly that data from ourselves 
and others indicate that even if infected peri- or post-vaccination the 
course of COVID-19 is much less severe in vaccinated vs. unvacci- 
nated individuals; [41—45] and - acknowledging that a small per- 
centage of vaccinated and symptomatic individuals were tested - our 
data demonstrated only 150 cases of confirmed infection (1%) in 
14,842 tested individuals from over 1 million vaccinated app users. It 
is also possible that individuals changed their behaviour immediately 
after vaccination, increasing infection risk and contracting SARS-CoV- 
2 prior to acquiring adequate immunity. Here, data from previous 
infections suggest vaccination reduces adherence to other public 


health measures, [46] with pre-prints suggesting that this is also 
occurring after vaccination against SARS-CoV-2 [47,48]. 

Overall, CSS app users are not fully representative of the UK popu- 
lation (younger, more likely to be female, of higher educational sta- 
tus, lack of ethnical minorities, and over-representative of healthcare 
workers). '° Although the population in the current study shares 
some of these biases, the median age of vaccinated individuals at the 
time of our analysis (64 years) was older than for app users overall 
(47 years), which is not surprising as the UK vaccination schedule 
began with the oldest individuals in the community. We considered 
the implications of this with respect to the likelihood of an infected 
person presenting for testing: although asymptomatic SARS-CoV-2 
infection is well-recognised, it is less common in older people [35]. 
We also acknowledge that different economic and cultural experien- 
ces may influence presentation of SARS-CoV-2 infection and report- 
ing of post-vaccination side-effects.*°Our sample is mainly composed 
of White individuals, which may affect applicability of our results to 
populations with different ethnicities. The inclusion of other ethnici- 
ties was not possible, as we did not have any individuals who fulfilled 
the entry criteria who identified as Black, Asian or Minority Ethnicity. 
We are unsure whether this reflects the known population bias 
amongst app participants, the strict criteria for inclusion in this study, 
and/or other reason [33,46]. 

Our approach in comparing symptom profiles for individuals test- 
ing positive or negative for SARS-CoV-2 required a 1:1 matched pop- 
ulation, so that comparison of symptom prevalence was fair and 
unbiased by the greatly different sample sizes of the two populations. 
However, this methodological choice is less reliable when used for 
the outcome of SARS-CoV-2 test prediction; and the forced balance of 
the classes can lead to an overestimation of the likelihood of being 
positive in the modelling. Thus, we have also presented extended 
analyses, using both boot-strapping, and entire-cohort approaches in 
the Supplementary Results. We acknowledge that the predictive 
power of our optimised models may be hampered if there are brand- 
specific post-vaccination side-effects, which were not considered 
during model optimisation [16,31]. However, although there were 
some differences in frequencies, most symptoms were reported in 
both PB and O-AZ pivotal trials [4,5,8,10]. We also did not consider 
type of SARS-CoV-testing (rtPCR vs. LFAT, noting here that LFAT has 
been confirmed to be an accurate alternative to rtPCR testing, [49,50] 
particularly in symptomatic individuals, with a specificity of 89.1% 
(95%CI: [86.3%, 91.9%]) and sensitivity of 5.4% (95%CI: [94%,96.8%])), 
or mode of testing access (NHS vs. ZOE-request), may also contribute 
variability to these models [50]. Here, our modest numbers preclude 
sensitivity testing. To avoid potential bias in the assessment of the 
performance of the models, we kept the same ratio of the population 
tested with either SARS-CoV-2 test in both training and validation 
sets. Lastly, we did not consider the different vaccine type in our 
models, since the number of individuals having each vaccine is 
approximately the same (O-AZ: 73, PB: 77), the bias caused by the 
vaccine type is almost insignificant. 

Our analyses do not consider the impact of COVID-19 prevalence 
in UK at the time of the vaccination. Positive and negative predictive 
values (PPV, NPV) for a test depend not only on test sensitivity and 
specificity but also on population prevalence of disease. The rapidly 
changing prevalence of SARS-CoV-2 infection in the UK and the pace 
of vaccination delivery over the time period of this study limits our 
capacity to provide accurate PPV and NPV. Further analyses, particu- 
larly in populations with higher prevalence of infection and/or higher 
symptom burden and severity of COVID-19, may result in better dif- 
ferentiation of early signs of infection from post-vaccination side- 
effects. 

A strength of our study was our very large cohort of vaccinated 
participants, in a country that was an early adopter of vaccination; 
and our timeframe included the UK pandemic “third wave”. Prospec- 
tive real-time symptom logging through the app minimised recall 
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bias; and our symptom assessment included direct ascertainment of 
core symptoms for accessing UK testing. However, the sharp decline 
in cases in the first six months of 2021 resulted in only 149 positive 
cases to inform our modelling. We acknowledge this number is small 
- though we were also able to draw upon large numbers of tested 
negative individuals for comparisons, reinforcing the consistency and 
generalisation of our results. Additionally, in this observational study 
no formal a priori sample size calculation was computable; posterior 
analysis concluded the necessary sample size for this study was 31 
individuals in each population (positive versus negative COVID-19 
infected individuals), for 95% confidence interval for the difference in 
proportions who are positive versus negative with a margin of error 
no more than 5%; with our population size (149 individuals per 
group), our study has a power >0.80. Power analysis for the compari- 
son between doses was also not computed, since we considered that 
four subjects are not representative of the vaccinated population. We 
also acknowledge that the demographic features of the app popula- 
tion especially those parameters considered for model estimation 
(e.g., age, gender, BMI) may be different in other populations within 
the UK and elsewhere. 

The implication of our results will vary depending on the popula- 
tion prevalence of SARS-CoV-2 and pace of vaccination roll-out. For 
example, at the time of writing New Zealand has negligible commu- 
nity spread of SARS-CoV-2 and is still early in vaccination roll-out. It 
would be relevant to repeat this study in these different circumstan- 
ces to maximise its utility to different populations, noting that the 
translation of this study to other social and demographic contexts 
could be challenging. The use of new technologies (either the app 
used here [modified for country of use, according to language, cul- 
tural context and/or other unique population features] or other tech- 
nologies) may not be feasible in some countries, particularly those 
without policies for safe use of personal and medical information for 
research. We also note that testing of all symptomatic individuals 
comes at a cost (e.g., testing kits, infrastructure). On the other hand, 
our data show that individuals manifesting symptoms post-vaccina- 
tion cannot be assumed to be uninfected (and, thus, non-infectious). 
Providing a testing kit to all individuals post-vaccination, with guid- 
ance as to when to test, may or may not prove tenable, noting here 
the widespread use of lateral flow testing currently in the asymptom- 
atic population. The UK is a resource-rich country; the impact of our 
results in countries with fewer health resources needs careful consid- 
eration. 

In conclusion, post-vaccination symptoms cannot be distin- 
guished with clinical confidence from early SARS-CoV-2 infection. 
Our study highlights the critical importance of testing symptomatic 
individuals - even if recently vaccinated — to ensure early detection 
of SARS-CoV-2 infection and help prevent future waves of COVID-19. 
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